@yhteoh yhteoh commented Sep 23, 2025

  • Please check if the PR fulfills these requirements
  • The commit message follows our guidelines
  • Tests for the changes have been added (for bug fixes / features)
  • Docs have been added / updated (for bug fixes / features)
  • What kind of change does this PR introduce? (Bug fix, feature, docs update, ...)
    Features
    Includes Support for table and document data #10

  • What is the current behavior? (You can also link to an open issue here)
    • Attributes cannot be set on a Datastore
    • Invalid GroupBase definitions raise no errors
    • Datastore groups are validated with a model validator
    • No schema is stored for groups in the hdf5 file
    • Special attributes are not protected
    • No support for constrained datasets, optional datasets, or dictionaries of datasets
    • dtype mapping uses bidict
    • Datasets do not support flexible shapes
  • What is the new behavior (if this is a feature change)?
    • Datastore has an attrs field
    • The model signature stored in the hdf5 file is renamed from model to _datastore_signature
    • GroupBase only supports additional fields of type Dataset, an attrs field of type Attrs, and automatic determination of class_
    • Group supports Optional[Dataset] and Dict[str, Dataset] fields
    • The JSON schema of each group is stored as the attribute _group_schema when the group is dumped to hdf5
    • Special attributes (_datastore_signature, _group_schema) are protected
    • dtype mapping uses an enum instead of bidict
    • Validation of Datastore and Group is refactored
    • Added a constrained dataset (condataset)
    • Added CastDataset, which enables automated casting of numpy arrays to a Dataset
    • A Dataset initialized without data can have a flexible shape; when data is assigned, its shape is checked for compatibility and becomes the concrete shape
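
The flexible-shape behavior in the last item can be illustrated with a minimal sketch. It is not the code from this PR: the function name `resolve_shape` and the convention of marking a flexible dimension with `None` are assumptions made for illustration only.

```python
from typing import Optional, Sequence, Tuple


def resolve_shape(
    declared: Sequence[Optional[int]], actual: Sequence[int]
) -> Tuple[int, ...]:
    """Check `actual` against `declared` and return the concrete shape.

    Assumption: a `None` entry in `declared` marks a flexible dimension.
    """
    if len(declared) != len(actual):
        raise ValueError(
            f"expected {len(declared)} dimensions, got {len(actual)}"
        )
    for axis, (want, got) in enumerate(zip(declared, actual)):
        # Fixed dimensions must match exactly; flexible ones accept any size.
        if want is not None and want != got:
            raise ValueError(f"axis {axis}: expected size {want}, got {got}")
    return tuple(actual)


# A dataset declared as (None, 3) accepts data of shape (10, 3)
# and takes that as its concrete shape.
print(resolve_shape((None, 3), (10, 3)))  # (10, 3)
```

Data of shape (10, 4) would be rejected here with a ValueError, since the second dimension is fixed at 3.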
  • Does this PR introduce a breaking change? (What changes might users need to make in their application due to this PR?)
    No

  • Other information:
    Partially fulfills Features to add #6
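
For readers unfamiliar with the idea, the protection of special attributes described above might look roughly like the sketch below. Only the two reserved names come from this PR; the class `GuardedAttrs` is hypothetical, and treating "protection" as rejecting user writes with a KeyError is an assumption.

```python
from collections.abc import MutableMapping

# Reserved names taken from the PR description.
RESERVED_ATTRS = frozenset({"_datastore_signature", "_group_schema"})


class GuardedAttrs(MutableMapping):
    """Hypothetical attribute mapping that rejects writes to reserved keys."""

    def __init__(self, initial=None):
        self._data = {}
        # update() routes through __setitem__, so the guard also applies here.
        self.update(initial or {})

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        if key in RESERVED_ATTRS:
            raise KeyError(f"{key!r} is reserved and cannot be set by users")
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


attrs = GuardedAttrs({"units": "m"})
attrs["description"] = "sensor readings"  # ordinary keys are accepted
```

Building on `MutableMapping` rather than subclassing `dict` means `update()` and the constructor cannot bypass the check.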

@yhteoh yhteoh added the enhancement (New feature or request) label Sep 23, 2025
@yhteoh yhteoh self-assigned this Sep 23, 2025
@yhteoh yhteoh changed the title from "Additional attr" to "Attribute handling and error handling" Sep 23, 2025
@yhteoh yhteoh linked an issue Sep 23, 2025 that may be closed by this pull request
@yhteoh yhteoh removed a link to an issue Sep 23, 2025
@yhteoh yhteoh force-pushed the additional_attr branch 2 times, most recently from 47b37f3 to afa2414 Compare September 24, 2025 17:43
…onstrained dataset. Datastore.model_dump_hdf5 default mode changed to "w".
… datastore model_dump_hdf5 ignore attrs and class_ when iterating through fields
@yhteoh yhteoh force-pushed the additional_attr branch 3 times, most recently from f7a197c to 84752cb Compare October 2, 2025 17:57
yhteoh added 17 commits October 7, 2025 08:52
…et, Table, Folder) and move str dtype dump and load logic from Datastore to GroupField subclass
…rray and confolder with shape and dimension constraints
…structured to convert unstructured numpy array and dict of numpy arrays to a structured numpy array
@yhteoh yhteoh changed the title from "Attribute handling and error handling" to "Attribute handling, error handling, Tables, Folders" Oct 14, 2025